A New Ensemble Diversity Measure Applied to Thinning Ensembles
Abstract
We introduce a new way of describing the diversity of an ensemble of classifiers, the Percentage Correct Diversity Measure, and compare it against existing methods. We then introduce two new methods for removing classifiers from an ensemble based on diversity calculations. Empirical results for twelve datasets from the UC Irvine repository show that diversity is generally modeled by our measure and ensembles can be made smaller without loss in accuracy.
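The abstract does not define the Percentage Correct Diversity Measure, so the following is only a rough sketch under a stated assumption: that the measure is the fraction of examples on which the share of correct classifiers falls between a lower and an upper bound. The function name, the `correct` matrix layout, and the 10% and 90% default bounds are illustrative assumptions, not details given in this listing.

```python
from typing import Sequence


def percentage_correct_diversity(
    correct: Sequence[Sequence[bool]],
    low: float = 0.10,   # assumed lower bound, not stated in the abstract
    high: float = 0.90,  # assumed upper bound, not stated in the abstract
) -> float:
    """Fraction of examples on which the ensemble members disagree in correctness.

    correct[i][j] is True when classifier i labels example j correctly.
    An example counts as contested when the share of correct classifiers
    lies within [low, high].
    """
    if not correct or not correct[0]:
        raise ValueError("need at least one classifier and one example")
    n_classifiers = len(correct)
    n_examples = len(correct[0])
    contested = 0
    for j in range(n_examples):
        share = sum(1 for row in correct if row[j]) / n_classifiers
        if low <= share <= high:
            contested += 1
    return contested / n_examples


# Hypothetical correctness matrix: 4 classifiers (rows) by 4 examples (columns).
example_matrix = [
    [True, True, False, True],
    [True, False, False, True],
    [True, True, False, False],
    [True, False, False, True],
]
diversity = percentage_correct_diversity(example_matrix)
```

Under this reading, examples that every classifier gets right or wrong contribute nothing, so a higher value indicates more disagreement among ensemble members.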
Similar resources
Ensemble diversity measures and their application to thinning
The diversity of an ensemble can be calculated in a variety of ways. Here a diversity metric and a means for altering the diversity of an ensemble, called “thinning”, are introduced. We experiment with thinning algorithms evaluated on ensembles created by several techniques on 22 publicly available datasets. When compared to other methods, our percentage correct diversity measure algorithm show...
Hierarchical cluster ensemble selection
Clustering ensemble performance is affected by two main factors: diversity and quality. Selecting a subset of the available ensemble members based on diversity and quality often leads to a more accurate ensemble solution. However, there is no definite relationship between diversity and quality when selecting a subset of ensemble members. This paper proposes the Hierarchical Cluster Ensemble Sel...
The ensemble clustering with maximize diversity using evolutionary optimization algorithms
Data clustering is one of the main steps in data mining, which is responsible for exploring hidden patterns in non-tagged data. Due to the complexity of the problem and the weakness of the basic clustering methods, most studies today are guided by clustering ensemble methods. Diversity in primary results is one of the most important factors that can affect the quality of the final results. Also...
Moderate diversity for better cluster ensembles
Adjusted Rand index is used to measure diversity in cluster ensembles and a diversity measure is subsequently proposed. Although the measure was found to be related to the quality of the ensemble, this relationship appeared to be non-monotonic. In some cases, ensembles which exhibited a moderate level of diversity gave a more accurate clustering. Based on this, a procedure for building a cluste...
A High-Performance Model based on Ensembles for Twitter Sentiment Classification
Background and Objectives: Twitter Sentiment Classification is one of the most popular fields in information retrieval and text mining. Millions of people around the world use social networks like Twitter intensively. It lets users publish tweets to say what they think about topics. There are numerous web sites built on the Internet presenting Twitter. The user can enter a sentiment ta...
Publication date: 2003